Frequent pattern mining on stream data using Hadoop CanTree-GTree
Similar resources
Distributed and Stream Data Mining Algorithms for Frequent Pattern Discovery
The use of distributed systems is continuously spreading across several application domains. Extracting valuable knowledge from raw data produced by distributed parties, in order to produce a unified global model, may present various challenges related to either the huge amount of managed data or their physical location and ownership. In case data are continuously produced (stream) and their anal...
Big Data Frequent Pattern Mining
Frequent pattern mining is an essential data mining task, with a goal of discovering knowledge in the form of repeated patterns. Many efficient pattern mining algorithms have been discovered in the last two decades, yet most do not scale to the type of data we are presented with today, the so-called “Big Data”. Scalable parallel algorithms hold the key to solving the problem in this context. In...
Frequent Sets Mining in Data Stream Environments
In recent years, data streams have emerged as a new data type that has attracted much attention from the data mining community. They arise naturally in a number of applications (Brian et al., 2002), including financial services (stock tickers, financial monitoring), sensor networks (earth-sensing satellites, astronomical observations), and web tracking and personalization (web click streams). These stre...
Frequent Itemset Mining over Stream Data: Overview
During the past decade, stream data mining has attracted widespread attention from experts and researchers all over the world, and a large number of interesting research results have been achieved. Among them, frequent itemset mining is one of the main research branches of stream data mining, with a fundamental and significant position. In order to further advance and develop the researc...
Approximate Frequent Pattern Discovery Over Data Stream
Frequent pattern discovery over a data stream is a hard problem because the continuously generated nature of the stream does not allow each data element to be revisited. Furthermore, the pattern discovery process must be fast to produce timely results. Based on these requirements, we propose an approximate approach to tackle the problem of discovering frequent patterns over a continuous stream. Our approximatio...
Journal
Journal title: Procedia Computer Science
Year: 2017
ISSN: 1877-0509
DOI: 10.1016/j.procs.2017.09.134